This data set contains 113,937 loans with 81 variables on each loan, including loan amount, borrower rate (or interest rate), current loan status, borrower income, borrower employment status, borrower credit history, and the latest payment information.

In this analysis I wanted to understand which party (Prosper, the investor, or the borrower) benefits most from Prosper loans, and in which situations. How easy is it for an investor to lose all their money on a Prosper loan? Which loans are most likely to earn an investor a large return? I also want to understand how specific credit ratings and scores relate to borrowers' characteristics. I hope to provide some interesting insights into the world of Prosper loans by the end of this analysis.

Analysis and Exploration of the Data

Univariate Plots

Size of Dataset and Data Types

## 'data.frame':    113937 obs. of  81 variables:
##  $ ListingKey                         : Factor w/ 113066 levels "00003546482094282EF90E5",..: 7180 7193 6647 6669 6686 6689 6699 6706 6687 6687 ...
##  $ ListingNumber                      : int  193129 1209647 81716 658116 909464 1074836 750899 768193 1023355 1023355 ...
##  $ ListingCreationDate                : Factor w/ 113064 levels "2005-11-09 20:44:28.847000000",..: 14184 111894 6429 64760 85967 100310 72556 74019 97834 97834 ...
##  $ CreditGrade                        : Factor w/ 8 levels "A","AA","B","C",..: 4 NA 7 NA NA NA NA NA NA NA ...
##  $ Term                               : int  36 36 36 36 36 60 36 36 36 36 ...
##  $ LoanStatus                         : Factor w/ 12 levels "Cancelled","Chargedoff",..: 3 4 3 4 4 4 4 4 4 4 ...
##  $ ClosedDate                         : Factor w/ 2802 levels "2005-11-25 00:00:00",..: 1137 NA 1262 NA NA NA NA NA NA NA ...
##  $ BorrowerAPR                        : num  0.165 0.12 0.283 0.125 0.246 ...
##  $ BorrowerRate                       : num  0.158 0.092 0.275 0.0974 0.2085 ...
##  $ LenderYield                        : num  0.138 0.082 0.24 0.0874 0.1985 ...
##  $ EstimatedEffectiveYield            : num  NA 0.0796 NA 0.0849 0.1832 ...
##  $ EstimatedLoss                      : num  NA 0.0249 NA 0.0249 0.0925 ...
##  $ EstimatedReturn                    : num  NA 0.0547 NA 0.06 0.0907 ...
##  $ ProsperRating.numeric              : int  NA 6 NA 6 3 5 2 4 7 7 ...
##  $ ProsperRating.Alpha                : Factor w/ 7 levels "A","AA","B","C",..: NA 1 NA 1 5 3 6 4 2 2 ...
##  $ ProsperScore                       : num  NA 7 NA 9 4 10 2 4 9 11 ...
##  $ ListingCategory.numeric            : int  0 2 0 16 2 1 1 2 7 7 ...
##  $ BorrowerState                      : Factor w/ 51 levels "AK","AL","AR",..: 6 6 11 11 24 33 17 5 15 15 ...
##  $ Occupation                         : Factor w/ 67 levels "Accountant/CPA",..: 36 42 36 51 20 42 49 28 23 23 ...
##  $ EmploymentStatus                   : Factor w/ 8 levels "Employed","Full-time",..: 8 1 3 1 1 1 1 1 1 1 ...
##  $ EmploymentStatusDuration           : int  2 44 NA 113 44 82 172 103 269 269 ...
##  $ IsBorrowerHomeowner                : Factor w/ 2 levels "False","True": 2 1 1 2 2 2 1 1 2 2 ...
##  $ CurrentlyInGroup                   : Factor w/ 2 levels "False","True": 2 1 2 1 1 1 1 1 1 1 ...
##  $ GroupKey                           : Factor w/ 706 levels "00343376901312423168731",..: NA NA 334 NA NA NA NA NA NA NA ...
##  $ DateCreditPulled                   : Factor w/ 112992 levels "2005-11-09 00:30:04.487000000",..: 14347 111883 6446 64724 85857 100382 72500 73937 97888 97888 ...
##  $ CreditScoreRangeLower              : int  640 680 480 800 680 740 680 700 820 820 ...
##  $ CreditScoreRangeUpper              : int  659 699 499 819 699 759 699 719 839 839 ...
##  $ FirstRecordedCreditLine            : Factor w/ 11585 levels "1947-08-24 00:00:00",..: 8638 6616 8926 2246 9497 496 8264 7684 5542 5542 ...
##  $ CurrentCreditLines                 : int  5 14 NA 5 19 21 10 6 17 17 ...
##  $ OpenCreditLines                    : int  4 14 NA 5 19 17 7 6 16 16 ...
##  $ TotalCreditLinespast7years         : int  12 29 3 29 49 49 20 10 32 32 ...
##  $ OpenRevolvingAccounts              : int  1 13 0 7 6 13 6 5 12 12 ...
##  $ OpenRevolvingMonthlyPayment        : num  24 389 0 115 220 1410 214 101 219 219 ...
##  $ InquiriesLast6Months               : int  3 3 0 0 1 0 0 3 1 1 ...
##  $ TotalInquiries                     : num  3 5 1 1 9 2 0 16 6 6 ...
##  $ CurrentDelinquencies               : int  2 0 1 4 0 0 0 0 0 0 ...
##  $ AmountDelinquent                   : num  472 0 NA 10056 0 ...
##  $ DelinquenciesLast7Years            : int  4 0 0 14 0 0 0 0 0 0 ...
##  $ PublicRecordsLast10Years           : int  0 1 0 0 0 0 0 1 0 0 ...
##  $ PublicRecordsLast12Months          : int  0 0 NA 0 0 0 0 0 0 0 ...
##  $ RevolvingCreditBalance             : num  0 3989 NA 1444 6193 ...
##  $ BankcardUtilization                : num  0 0.21 NA 0.04 0.81 0.39 0.72 0.13 0.11 0.11 ...
##  $ AvailableBankcardCredit            : num  1500 10266 NA 30754 695 ...
##  $ TotalTrades                        : num  11 29 NA 26 39 47 16 10 29 29 ...
##  $ TradesNeverDelinquent.percentage   : num  0.81 1 NA 0.76 0.95 1 0.68 0.8 1 1 ...
##  $ TradesOpenedLast6Months            : num  0 2 NA 0 2 0 0 0 1 1 ...
##  $ DebtToIncomeRatio                  : num  0.17 0.18 0.06 0.15 0.26 0.36 0.27 0.24 0.25 0.25 ...
##  $ IncomeRange                        : Factor w/ 8 levels "$0","$1-24,999",..: 4 5 7 4 3 3 4 4 4 4 ...
##  $ IncomeVerifiable                   : Factor w/ 2 levels "False","True": 2 2 2 2 2 2 2 2 2 2 ...
##  $ StatedMonthlyIncome                : num  3083 6125 2083 2875 9583 ...
##  $ LoanKey                            : Factor w/ 113066 levels "00003683605746079487FF7",..: 100337 69837 46303 70776 71387 86505 91250 5425 908 908 ...
##  $ TotalProsperLoans                  : int  NA NA NA NA 1 NA NA NA NA NA ...
##  $ TotalProsperPaymentsBilled         : int  NA NA NA NA 11 NA NA NA NA NA ...
##  $ OnTimeProsperPayments              : int  NA NA NA NA 11 NA NA NA NA NA ...
##  $ ProsperPaymentsLessThanOneMonthLate: int  NA NA NA NA 0 NA NA NA NA NA ...
##  $ ProsperPaymentsOneMonthPlusLate    : int  NA NA NA NA 0 NA NA NA NA NA ...
##  $ ProsperPrincipalBorrowed           : num  NA NA NA NA 11000 NA NA NA NA NA ...
##  $ ProsperPrincipalOutstanding        : num  NA NA NA NA 9948 ...
##  $ ScorexChangeAtTimeOfListing        : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ LoanCurrentDaysDelinquent          : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ LoanFirstDefaultedCycleNumber      : int  NA NA NA NA NA NA NA NA NA NA ...
##  $ LoanMonthsSinceOrigination         : int  78 0 86 16 6 3 11 10 3 3 ...
##  $ LoanNumber                         : int  19141 134815 6466 77296 102670 123257 88353 90051 121268 121268 ...
##  $ LoanOriginalAmount                 : int  9425 10000 3001 10000 15000 15000 3000 10000 10000 10000 ...
##  $ LoanOriginationDate                : Factor w/ 1873 levels "2005-11-15 00:00:00",..: 426 1866 260 1535 1757 1821 1649 1666 1813 1813 ...
##  $ LoanOriginationQuarter             : Factor w/ 33 levels "Q1 2006","Q1 2007",..: 18 8 2 32 24 33 16 16 33 33 ...
##  $ MemberKey                          : Factor w/ 90831 levels "00003397697413387CAF966",..: 11071 10302 33781 54939 19465 48037 60448 40951 26129 26129 ...
##  $ MonthlyLoanPayment                 : num  330 319 123 321 564 ...
##  $ LP_CustomerPayments                : num  11396 0 4187 5143 2820 ...
##  $ LP_CustomerPrincipalPayments       : num  9425 0 3001 4091 1563 ...
##  $ LP_InterestandFees                 : num  1971 0 1186 1052 1257 ...
##  $ LP_ServiceFees                     : num  -133.2 0 -24.2 -108 -60.3 ...
##  $ LP_CollectionFees                  : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_GrossPrincipalLoss              : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_NetPrincipalLoss                : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ LP_NonPrincipalRecoverypayments    : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ PercentFunded                      : num  1 1 1 1 1 1 1 1 1 1 ...
##  $ Recommendations                    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ InvestmentFromFriendsCount         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ InvestmentFromFriendsAmount        : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Investors                          : int  258 1 41 158 20 1 1 1 1 1 ...

Summary Statistics of Data

Table continues below
ListingKey ListingNumber ListingCreationDate
17A93590655669644DB4C06: 6 Min. : 4 2013-10-02 17:20:16.550000000: 6
349D3587495831350F0F648: 4 1st Qu.: 400919 2013-08-28 20:31:41.107000000: 4
47C1359638497431975670B: 4 Median : 600554 2013-09-08 09:27:44.853000000: 4
8474358854651984137201C: 4 Mean : 627886 2013-12-06 05:43:13.830000000: 4
DE8535960513435199406CE: 4 3rd Qu.: 892634 2013-12-06 11:44:58.283000000: 4
04C13599434217079754AEE: 3 Max. :1255725 2013-08-21 07:25:22.360000000: 3
(Other) :113912 NA (Other) :113912
Table continues below
CreditGrade Term LoanStatus ClosedDate
C : 5649 Min. :12.00 Current :56576 2014-03-04 00:00:00: 105
D : 5153 1st Qu.:36.00 Completed :38074 2014-02-19 00:00:00: 100
B : 4389 Median :36.00 Chargedoff :11992 2014-02-11 00:00:00: 92
AA : 3509 Mean :40.83 Defaulted : 5018 2012-10-30 00:00:00: 81
HR : 3508 3rd Qu.:36.00 Past Due (1-15 days) : 806 2013-02-26 00:00:00: 78
(Other): 6745 Max. :60.00 Past Due (31-60 days): 363 (Other) :54633
NA’s :84984 NA (Other) : 1108 NA’s :58848
Table continues below
BorrowerAPR BorrowerRate LenderYield EstimatedEffectiveYield
Min. :0.00653 Min. :0.0000 Min. :-0.0100 Min. :-0.183
1st Qu.:0.15629 1st Qu.:0.1340 1st Qu.: 0.1242 1st Qu.: 0.116
Median :0.20976 Median :0.1840 Median : 0.1730 Median : 0.162
Mean :0.21883 Mean :0.1928 Mean : 0.1827 Mean : 0.169
3rd Qu.:0.28381 3rd Qu.:0.2500 3rd Qu.: 0.2400 3rd Qu.: 0.224
Max. :0.51229 Max. :0.4975 Max. : 0.4925 Max. : 0.320
NA’s :25 NA NA NA’s :29084
Table continues below
EstimatedLoss EstimatedReturn ProsperRating.numeric ProsperRating.Alpha
Min. :0.005 Min. :-0.183 Min. :1.000 C :18345
1st Qu.:0.042 1st Qu.: 0.074 1st Qu.:3.000 B :15581
Median :0.072 Median : 0.092 Median :4.000 A :14551
Mean :0.080 Mean : 0.096 Mean :4.072 D :14274
3rd Qu.:0.112 3rd Qu.: 0.117 3rd Qu.:5.000 E : 9795
Max. :0.366 Max. : 0.284 Max. :7.000 (Other):12307
NA’s :29084 NA’s :29084 NA’s :29084 NA’s :29084
Table continues below
ProsperScore ListingCategory.numeric BorrowerState
Min. : 1.00 Min. : 0.000 CA :14717
1st Qu.: 4.00 1st Qu.: 1.000 TX : 6842
Median : 6.00 Median : 1.000 NY : 6729
Mean : 5.95 Mean : 2.774 FL : 6720
3rd Qu.: 8.00 3rd Qu.: 3.000 IL : 5921
Max. :11.00 Max. :20.000 (Other):67493
NA’s :29084 NA NA’s : 5515
Table continues below
Occupation EmploymentStatus EmploymentStatusDuration
Other :28617 Employed :67322 Min. : 0.00
Professional :13628 Full-time :26355 1st Qu.: 26.00
Computer Programmer: 4478 Self-employed: 6134 Median : 67.00
Executive : 4311 Not available: 5347 Mean : 96.07
Teacher : 3759 Other : 3806 3rd Qu.:137.00
(Other) :55556 (Other) : 2718 Max. :755.00
NA’s : 3588 NA’s : 2255 NA’s :7625
Table continues below
IsBorrowerHomeowner CurrentlyInGroup GroupKey
False:56459 False:101218 783C3371218786870A73D20: 1140
True :57478 True : 12719 3D4D3366260257624AB272D: 916
NA NA 6A3B336601725506917317E: 698
NA NA FEF83377364176536637E50: 611
NA NA C9643379247860156A00EC0: 342
NA NA (Other) : 9634
NA NA NA’s :100596
Table continues below
DateCreditPulled CreditScoreRangeLower CreditScoreRangeUpper
2013-12-23 09:38:12: 6 Min. : 0.0 Min. : 19.0
2013-11-21 09:09:41: 4 1st Qu.:660.0 1st Qu.:679.0
2013-12-06 05:43:16: 4 Median :680.0 Median :699.0
2014-01-14 20:17:49: 4 Mean :685.6 Mean :704.6
2014-02-09 12:14:41: 4 3rd Qu.:720.0 3rd Qu.:739.0
2013-09-27 22:04:54: 3 Max. :880.0 Max. :899.0
(Other) :113912 NA’s :591 NA’s :591
Table continues below
FirstRecordedCreditLine CurrentCreditLines OpenCreditLines
1993-12-01 00:00:00: 185 Min. : 0.00 Min. : 0.00
1994-11-01 00:00:00: 178 1st Qu.: 7.00 1st Qu.: 6.00
1995-11-01 00:00:00: 168 Median :10.00 Median : 9.00
1990-04-01 00:00:00: 161 Mean :10.32 Mean : 9.26
1995-03-01 00:00:00: 159 3rd Qu.:13.00 3rd Qu.:12.00
(Other) :112389 Max. :59.00 Max. :54.00
NA’s : 697 NA’s :7604 NA’s :7604
Table continues below
TotalCreditLinespast7years OpenRevolvingAccounts
Min. : 2.00 Min. : 0.00
1st Qu.: 17.00 1st Qu.: 4.00
Median : 25.00 Median : 6.00
Mean : 26.75 Mean : 6.97
3rd Qu.: 35.00 3rd Qu.: 9.00
Max. :136.00 Max. :51.00
NA’s :697 NA
Table continues below
OpenRevolvingMonthlyPayment InquiriesLast6Months TotalInquiries
Min. : 0.0 Min. : 0.000 Min. : 0.000
1st Qu.: 114.0 1st Qu.: 0.000 1st Qu.: 2.000
Median : 271.0 Median : 1.000 Median : 4.000
Mean : 398.3 Mean : 1.435 Mean : 5.584
3rd Qu.: 525.0 3rd Qu.: 2.000 3rd Qu.: 7.000
Max. :14985.0 Max. :105.000 Max. :379.000
NA NA’s :697 NA’s :1159
Table continues below
CurrentDelinquencies AmountDelinquent DelinquenciesLast7Years
Min. : 0.0000 Min. : 0.0 Min. : 0.000
1st Qu.: 0.0000 1st Qu.: 0.0 1st Qu.: 0.000
Median : 0.0000 Median : 0.0 Median : 0.000
Mean : 0.5921 Mean : 984.5 Mean : 4.155
3rd Qu.: 0.0000 3rd Qu.: 0.0 3rd Qu.: 3.000
Max. :83.0000 Max. :463881.0 Max. :99.000
NA’s :697 NA’s :7622 NA’s :990
Table continues below
PublicRecordsLast10Years PublicRecordsLast12Months RevolvingCreditBalance
Min. : 0.0000 Min. : 0.000 Min. : 0
1st Qu.: 0.0000 1st Qu.: 0.000 1st Qu.: 3121
Median : 0.0000 Median : 0.000 Median : 8549
Mean : 0.3126 Mean : 0.015 Mean : 17599
3rd Qu.: 0.0000 3rd Qu.: 0.000 3rd Qu.: 19521
Max. :38.0000 Max. :20.000 Max. :1435667
NA’s :697 NA’s :7604 NA’s :7604
Table continues below
BankcardUtilization AvailableBankcardCredit TotalTrades
Min. :0.000 Min. : 0 Min. : 0.00
1st Qu.:0.310 1st Qu.: 880 1st Qu.: 15.00
Median :0.600 Median : 4100 Median : 22.00
Mean :0.561 Mean : 11210 Mean : 23.23
3rd Qu.:0.840 3rd Qu.: 13180 3rd Qu.: 30.00
Max. :5.950 Max. :646285 Max. :126.00
NA’s :7604 NA’s :7544 NA’s :7544
Table continues below
TradesNeverDelinquent.percentage TradesOpenedLast6Months DebtToIncomeRatio
Min. :0.000 Min. : 0.000 Min. : 0.000
1st Qu.:0.820 1st Qu.: 0.000 1st Qu.: 0.140
Median :0.940 Median : 0.000 Median : 0.220
Mean :0.886 Mean : 0.802 Mean : 0.276
3rd Qu.:1.000 3rd Qu.: 1.000 3rd Qu.: 0.320
Max. :1.000 Max. :20.000 Max. :10.010
NA’s :7544 NA’s :7544 NA’s :8554
Table continues below
IncomeRange IncomeVerifiable StatedMonthlyIncome
$25,000-49,999:32192 False: 8669 Min. : 0
$50,000-74,999:31050 True :105268 1st Qu.: 3200
$100,000+ :17337 NA Median : 4667
$75,000-99,999:16916 NA Mean : 5608
Not displayed : 7741 NA 3rd Qu.: 6825
$1-24,999 : 7274 NA Max. :1750003
(Other) : 1427 NA NA
Table continues below
LoanKey TotalProsperLoans TotalProsperPaymentsBilled
CB1B37030986463208432A1: 6 Min. :0.00 Min. : 0.00
2DEE3698211017519D7333F: 4 1st Qu.:1.00 1st Qu.: 9.00
9F4B37043517554537C364C: 4 Median :1.00 Median : 16.00
D895370150591392337ED6D: 4 Mean :1.42 Mean : 22.93
E6FB37073953690388BC56D: 4 3rd Qu.:2.00 3rd Qu.: 33.00
0D8F37036734373301ED419: 3 Max. :8.00 Max. :141.00
(Other) :113912 NA’s :91852 NA’s :91852
Table continues below
OnTimeProsperPayments ProsperPaymentsLessThanOneMonthLate
Min. : 0.00 Min. : 0.00
1st Qu.: 9.00 1st Qu.: 0.00
Median : 15.00 Median : 0.00
Mean : 22.27 Mean : 0.61
3rd Qu.: 32.00 3rd Qu.: 0.00
Max. :141.00 Max. :42.00
NA’s :91852 NA’s :91852
Table continues below
ProsperPaymentsOneMonthPlusLate ProsperPrincipalBorrowed
Min. : 0.00 Min. : 0
1st Qu.: 0.00 1st Qu.: 3500
Median : 0.00 Median : 6000
Mean : 0.05 Mean : 8472
3rd Qu.: 0.00 3rd Qu.:11000
Max. :21.00 Max. :72499
NA’s :91852 NA’s :91852
Table continues below
ProsperPrincipalOutstanding ScorexChangeAtTimeOfListing
Min. : 0 Min. :-209.00
1st Qu.: 0 1st Qu.: -35.00
Median : 1627 Median : -3.00
Mean : 2930 Mean : -3.22
3rd Qu.: 4127 3rd Qu.: 25.00
Max. :23451 Max. : 286.00
NA’s :91852 NA’s :95009
Table continues below
LoanCurrentDaysDelinquent LoanFirstDefaultedCycleNumber
Min. : 0.0 Min. : 0.00
1st Qu.: 0.0 1st Qu.: 9.00
Median : 0.0 Median :14.00
Mean : 152.8 Mean :16.27
3rd Qu.: 0.0 3rd Qu.:22.00
Max. :2704.0 Max. :44.00
NA NA’s :96985
Table continues below
LoanMonthsSinceOrigination LoanNumber LoanOriginalAmount
Min. : 0.0 Min. : 1 Min. : 1000
1st Qu.: 6.0 1st Qu.: 37332 1st Qu.: 4000
Median : 21.0 Median : 68599 Median : 6500
Mean : 31.9 Mean : 69444 Mean : 8337
3rd Qu.: 65.0 3rd Qu.:101901 3rd Qu.:12000
Max. :100.0 Max. :136486 Max. :35000
NA NA NA
Table continues below
LoanOriginationDate LoanOriginationQuarter MemberKey
2014-01-22 00:00:00: 491 Q4 2013:14450 63CA34120866140639431C9: 9
2013-11-13 00:00:00: 490 Q1 2014:12172 16083364744933457E57FB9: 8
2014-02-19 00:00:00: 439 Q3 2013: 9180 3A2F3380477699707C81385: 8
2013-10-16 00:00:00: 434 Q2 2013: 7099 4D9C3403302047712AD0CDD: 8
2014-01-28 00:00:00: 339 Q3 2012: 5632 739C338135235294782AE75: 8
2013-09-24 00:00:00: 316 Q2 2012: 5061 7E1733653050264822FAA3D: 8
(Other) :111428 (Other):60343 (Other) :113888
Table continues below
MonthlyLoanPayment LP_CustomerPayments LP_CustomerPrincipalPayments
Min. : 0.0 Min. : -2.35 Min. : 0.0
1st Qu.: 131.6 1st Qu.: 1005.76 1st Qu.: 500.9
Median : 217.7 Median : 2583.83 Median : 1587.5
Mean : 272.5 Mean : 4183.08 Mean : 3105.5
3rd Qu.: 371.6 3rd Qu.: 5548.40 3rd Qu.: 4000.0
Max. :2251.5 Max. :40702.39 Max. :35000.0
NA NA NA
Table continues below
LP_InterestandFees LP_ServiceFees LP_CollectionFees
Min. : -2.35 Min. :-664.87 Min. :-9274.75
1st Qu.: 274.87 1st Qu.: -73.18 1st Qu.: 0.00
Median : 700.84 Median : -34.44 Median : 0.00
Mean : 1077.54 Mean : -54.73 Mean : -14.24
3rd Qu.: 1458.54 3rd Qu.: -13.92 3rd Qu.: 0.00
Max. :15617.03 Max. : 32.06 Max. : 0.00
NA NA NA
Table continues below
LP_GrossPrincipalLoss LP_NetPrincipalLoss LP_NonPrincipalRecoverypayments
Min. : -94.2 Min. : -954.5 Min. : 0.00
1st Qu.: 0.0 1st Qu.: 0.0 1st Qu.: 0.00
Median : 0.0 Median : 0.0 Median : 0.00
Mean : 700.4 Mean : 681.4 Mean : 25.14
3rd Qu.: 0.0 3rd Qu.: 0.0 3rd Qu.: 0.00
Max. :25000.0 Max. :25000.0 Max. :21117.90
NA NA NA
Table continues below
PercentFunded Recommendations InvestmentFromFriendsCount
Min. :0.7000 Min. : 0.00000 Min. : 0.00000
1st Qu.:1.0000 1st Qu.: 0.00000 1st Qu.: 0.00000
Median :1.0000 Median : 0.00000 Median : 0.00000
Mean :0.9986 Mean : 0.04803 Mean : 0.02346
3rd Qu.:1.0000 3rd Qu.: 0.00000 3rd Qu.: 0.00000
Max. :1.0125 Max. :39.00000 Max. :33.00000
NA NA NA
InvestmentFromFriendsAmount Investors
Min. : 0.00 Min. : 1.00
1st Qu.: 0.00 1st Qu.: 2.00
Median : 0.00 Median : 44.00
Mean : 16.55 Mean : 80.48
3rd Qu.: 0.00 3rd Qu.: 115.00
Max. :25000.00 Max. :1189.00
NA NA

Credit grades were only assigned to listings before 2009, so they are missing for most loans (84,984 of 113,937). Among the loans that do have one, a C grade is the most prevalent. I wonder why C-grade borrowers are granted the most loans.

Lower credit scores mostly fall between 360 and 880. Some borrowers have a lower credit score of 0, since a credit score below 350 is automatically set to zero. If you remove these zero scores, you can see odd gaps around credit scores 600-620 and 780-800 that contain no scores. I do not understand these gaps, but overall the distribution is roughly normal.

There are a few outlying upper credit scores of 19, but if these are removed the overall plot is roughly normal, with small holes around credit scores 600-620 and 780-800, just like in the lower credit scores.
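The filtering described above can be sketched roughly as follows. The data frame is a small made-up stand-in for the loan data; only the column name `CreditScoreRangeLower` comes from the data set, and the 20-point bin width is an assumption based on the lower and upper score ranges shown in the structure listing.

```r
# Toy stand-in for the loan data (values are invented for illustration).
loans <- data.frame(
  CreditScoreRangeLower = c(0, 0, 520, 580, 640, 640, 660, 680,
                            680, 700, 720, 760, 820, NA)
)

# Drop missing scores and the zero placeholder values.
scores <- loans$CreditScoreRangeLower
scores <- scores[!is.na(scores) & scores > 0]

# Count loans in 20-point bins (assumed width) across the observed range.
breaks <- seq(min(scores), max(scores) + 20, by = 20)
bins <- cut(scores, breaks = breaks, right = FALSE)
counts <- table(bins)

# Bins with a count of zero are the "holes" visible in a histogram.
empty_bins <- names(counts)[counts == 0]
print(counts)
print(empty_bins)
```

On the real data, listing the empty bins this way would confirm whether the gaps really sit at 600-620 and 780-800 or are an effect of the histogram's bin edges.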

36 months is the most prevalent loan term. I wonder why 36-month loans are so common.

While there are many defaulted or charged-off loans, most loans are current or completed.

Estimated returns seem to have a roughly normal distribution centered around 0.1, with some outliers that extend into negative returns.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  -0.183   0.074   0.092   0.096   0.117   0.284   29084

California has more Prosper borrowers than any other state (14,717). It seems that the states with the largest cities have the most borrowers.

Most borrowers have a debt-to-income ratio around 0.22, but the distribution is positively skewed with many large outliers. For example, some have a ratio of 10.01:1. I wonder which type of borrower takes on such a large debt-to-income ratio?

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0000  0.1400  0.2200  0.2759  0.3200 10.0100

The most common income range among Prosper borrowers is $25,000-$49,999.

The median stated monthly income is $4,667, which works out to about $56,000 a year and is in line with the most common income ranges. At the same time, one borrower had a stated monthly income of $1,750,003. Could this outlier be an error? This person would be making about $21,000,000 a year, yet needed a loan of only $4,000.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       0    3200    4667    5608    6825 1750000
##   LoanOriginalAmount
## 1               4000
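A lookup like the one above can be sketched as follows. The data frame is a toy stand-in: the column names come from the data set and the largest income and its loan amount mirror the figures reported above, but the other rows are invented for illustration.

```r
# Toy stand-in for the loan data (rows are invented for illustration).
loans <- data.frame(
  StatedMonthlyIncome = c(3200, 4667, 6825, 1750003),
  LoanOriginalAmount  = c(6500, 12000, 10000, 4000)
)

# Summary of the stated monthly income.
print(summary(loans$StatedMonthlyIncome))

# Loan amount requested by the borrower with the largest stated income.
top <- loans[which.max(loans$StatedMonthlyIncome), "LoanOriginalAmount", drop = FALSE]
print(top)

# Annualize the stated monthly income to judge how plausible it is.
annual_income <- 12 * max(loans$StatedMonthlyIncome)
print(annual_income)
```

Multiplying the monthly figure by 12 makes the mismatch between income and loan size easy to see.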

There seem to be spikes at specific loan amounts, which may stem from the way Prosper creates loans. The distribution of loan amounts is also positively skewed, with a median of $6,500 and a maximum of $35,000.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    4000    6500    8337   12000   35000

Closed dates do not seem to have any outliers, but there is a large dip in the number of closed loans in 2012. I wonder why that is.

##         Min.      1st Qu.       Median         Mean      3rd Qu. 
## "2005-11-25" "2009-07-14" "2011-04-05" "2011-03-07" "2013-01-30" 
##         Max. 
## "2014-03-10"
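Because `ClosedDate` is stored as a factor of timestamp strings, it has to be converted before it can be summarized as a date. The sketch below shows one way to do that and to count closed loans per year. The three dates are taken from the summary above, but the data frame itself is a toy stand-in and the per-year count is an assumed way of checking the 2012 dip.

```r
# Toy stand-in for the ClosedDate column (a factor of timestamp strings).
loans <- data.frame(
  ClosedDate = factor(c("2005-11-25 00:00:00", "2011-04-05 00:00:00",
                        "2014-03-10 00:00:00", NA))
)

# Keep the first 10 characters (YYYY-MM-DD) and parse them as dates.
closed <- as.Date(substr(as.character(loans$ClosedDate), 1, 10),
                  format = "%Y-%m-%d")

# Loans that are still open have no closed date, so drop the NAs.
closed <- closed[!is.na(closed)]
print(range(closed))

# Count closed loans per calendar year to look for dips.
per_year <- table(format(closed, "%Y"))
print(per_year)
```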

Borrower APR does not have many outliers, and the median APR is 0.2098. The distribution seems roughly normal, with a few spikes at 0.3 and 0.38.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00653 0.15630 0.20980 0.21880 0.28380 0.51230

Most borrowers are listed under debt consolidation (category 1, with 58,308 loans).

## 
##     0     1     2     3     4     5     6     7     8     9    10    11 
## 16965 58308  7433  7189  2395   756  2572 10494   199    85    91   217 
##    12    13    14    15    16    17    18    19    20 
##    59  1996   876  1522   304    52   885   768   771

Almost all borrowers (109,678 of 113,937) have no recommendations when they get a Prosper loan, which is striking. Even among those with at least one recommendation, the majority have only one. It seems that either most people who get Prosper loans do not have many people vouching for them, or recommendations are not very important to lenders.

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##  0.00000  0.00000  0.00000  0.04803  0.00000 39.00000
## 
##      0      1      2      3      4      5      6      7      8      9 
## 109678   3516    568    108     26     14      4      5      3      6 
##     14     16     18     19     21     24     39 
##      1      2      2      1      1      1      1

The distribution of current credit lines is very smooth, with a slight positive skew since there are quite a few large upper outliers. The median is 10 current credit lines.

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    7.00   10.00   10.32   13.00   59.00

Almost all borrowers are listed as employed or full-time. It seems that having a steady job is very important for showing lenders that you will pay off your loan.

##      Employed     Full-time Not available  Not employed         Other 
##         67322         26355          5347           835          3806 
##     Part-time       Retired Self-employed 
##          1088           795          6134

Investment from friends is very positively skewed. The distribution is easier to see on a log scale.
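The effect of the log scale can be sketched with base R as below. The amounts are invented, and `hist()` with `plot = FALSE` is used only so the bin counts can be compared. The original plots may have been drawn with a different plotting package.

```r
# Toy stand-in: most loans have no investment from friends, a few have a lot.
amount <- c(rep(0, 90), 25, 50, 100, 250, 500, 1000, 2500, 5000, 10000, 25000)

# log10 is undefined at 0, so look only at loans with a positive amount.
positive <- amount[amount > 0]

# Bin counts on the raw scale versus the log10 scale.
raw_hist <- hist(positive, breaks = 10, plot = FALSE)
log_hist <- hist(log10(positive), breaks = 10, plot = FALSE)

# On the raw scale most values pile into the first bin;
# on the log10 scale they spread across the bins.
print(raw_hist$counts)
print(log_hist$counts)
```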

Bivariate Plots

Leading up to the 2008 financial crisis there seemed to be a large increase in defaulted loans. Those loans seemed to convert to charged-off loans after 2008; these then increased in number until around 2009, after which they slowly decreased as completed loans greatly increased in number. It seems that the later drastic decrease in completed loans led to the large overall decrease in the number of closed loans in 2012. After this, both charged-off and completed loans increase up to the present.

Most of the highly correlated pairs in the correlation matrix are interrelated variables and, on their own, do not shed much light on relationships in the data.

Borrowers with a stated monthly income below about $8,334 (roughly $100,000 a year) seem unable to get a loan larger than $25,000. I wonder if Prosper has set a maximum loan amount of $25,000 for lower-income borrowers?

##    StatedMonthlyIncome LoanOriginalAmount
## 1             8333.333              35000
## 2             8333.333              30000
## 3             8333.333              35000
## 4             8333.333              30000
## 5             8333.333              35000
## 6             8333.333              35000
## 7             8333.333              35000
## 8             8333.333              35000
## 9             8333.333              30000
## 10            8333.333              30000
## 11            8333.333              30000
## 12            8333.333              35000
## 13            8333.333              30000
## 14            8333.333              28000
## 15            8333.333              28000
## 16            8333.333              35000
## 17            8333.333              35000
## 18            8333.333              30000
## 19            8333.333              30000
## 20            8333.333              30000
## 21            8333.333              35000
## 22            8333.333              34700
## 23            8333.333              35000
## 24            8333.333              32500
## 25            8333.333              35000
## 26            8333.333              35000
## 27            8333.333              35000
## 28            8333.333              32500
## 29            8333.333              30000
## 30            8333.333              35000
## 31            8333.333              30000
## 32            8333.333              30000
## 33            8333.333              28000
## 34            8333.333              35000
## 35            8333.333              28500
## 36            8333.333              35000
## 37            8333.333              30000
## 38            8333.333              30000
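One way to find this kind of threshold is to take the loans above $25,000 and look for the smallest stated income among them. The sketch below does this on a toy data frame; the column names come from the data set, but the rows and the exact filtering steps are assumptions, not necessarily how the table above was produced.

```r
# Toy stand-in for the loan data (rows are invented for illustration).
loans <- data.frame(
  StatedMonthlyIncome = c(3083, 6125, 8333.333, 8333.333, 9583, 12000),
  LoanOriginalAmount  = c(9425, 25000, 35000, 30000, 15000, 28000)
)

# Keep only loans above $25,000.
large <- subset(loans, LoanOriginalAmount > 25000)

# Smallest stated monthly income among those large loans.
min_income <- min(large$StatedMonthlyIncome)
print(min_income)

# Rows sitting exactly at that minimum income.
at_min <- subset(large, StatedMonthlyIncome == min_income,
                 select = c(StatedMonthlyIncome, LoanOriginalAmount))
print(at_min)

# Annualized, the threshold is very close to $100,000.
print(12 * min_income)
```

Since $8,333.33 a month is $100,000 a year, the cutoff may be tied to the $100,000+ income range rather than to a monthly figure.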

If you look at how much money a loan is estimated to make on average, loans to borrowers with a C rating tend to give the largest returns.

## $AA
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    14.6   288.8   579.1   639.2   853.5  2831.0 
## 
## $A
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   25.52  382.20  708.00  807.60 1090.00 3055.00 
## 
## $B
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    -8.0   516.6   894.9   998.8  1358.0  3570.0 
## 
## $C
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   -60.0   509.8   927.4  1007.0  1339.0  4118.0 
## 
## $D
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
##   -5.168  420.400  756.300  832.900 1146.000 3096.000 
## 
## $E
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -21.42  404.10  480.90  567.60  704.40 2930.00 
## 
## $HR
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -1656.0   311.5   453.9   395.0   498.4  2029.0 
## 
## $NC
## NULL
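Per-grade summaries like these can be sketched with `tapply()`. The sketch rests on two assumptions that the output does not confirm: that the combined grade (named `AllCreditGrades` in the model output further down) takes the pre-2009 `CreditGrade` when present and `ProsperRating.Alpha` otherwise, and that the dollar return is `EstimatedReturn` multiplied by `LoanOriginalAmount`. The rows are invented, and `EstimatedReturnDollars` is a hypothetical column name.

```r
# Toy stand-in for the loan data (rows are invented for illustration).
loans <- data.frame(
  CreditGrade         = c("C", NA, "HR", NA, NA, NA),
  ProsperRating.Alpha = c(NA, "A", NA, "A", "D", "B"),
  EstimatedReturn     = c(NA, 0.0547, NA, 0.06, 0.0907, 0.08),
  LoanOriginalAmount  = c(9425, 10000, 3001, 10000, 15000, 15000),
  stringsAsFactors = FALSE
)

grade_levels <- c("AA", "A", "B", "C", "D", "E", "HR", "NC")

# Assumption: use the pre-2009 grade when present, otherwise the Prosper rating.
combined <- ifelse(is.na(loans$CreditGrade),
                   loans$ProsperRating.Alpha, loans$CreditGrade)
loans$AllCreditGrades <- factor(combined, levels = grade_levels)

# Assumption: dollar return = estimated return rate x original loan amount.
loans$EstimatedReturnDollars <- loans$EstimatedReturn * loans$LoanOriginalAmount

# Summarize per grade, using only loans that have an estimated return.
rated <- loans[!is.na(loans$EstimatedReturnDollars), ]
by_grade <- tapply(rated$EstimatedReturnDollars, rated$AllCreditGrades, summary)
print(by_grade)
```

Grades with no usable rows come back as `NULL`, which matches the `$NC` entry in the output above.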

In percentage terms, the estimated return rises as loans get riskier and the borrower's credit rating gets worse, which makes sense since APR increases as well. The mean climbs from AA through E and then dips slightly for HR. An analysis of variance gives an F value of 13792, and a Tukey post hoc test gives extremely low p-values for all pairs.

## $AA
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01460 0.04554 0.05100 0.05399 0.05540 0.19360 
## 
## $A
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01780 0.06081 0.06663 0.06965 0.07284 0.18310 
## 
## $B
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.00100  0.07408  0.08215  0.08629  0.09260  0.28370 
## 
## $C
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.00910  0.08227  0.09220  0.09810  0.11050  0.26670 
## 
## $D
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -0.0045  0.1012  0.1163  0.1187  0.1414  0.2332 
## 
## $E
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -0.0124  0.1054  0.1239  0.1247  0.1487  0.1843 
## 
## $HR
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -0.1827  0.1135  0.1221  0.1136  0.1246  0.1399 
## 
## $NC
## NULL
##                    Df Sum Sq Mean Sq F value Pr(>F)    
## AllCreditGrades     6  38.73   6.454   13792 <2e-16 ***
## Residuals       84846  39.71   0.000                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 29084 observations deleted due to missingness
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = EstimatedReturn ~ AllCreditGrades, data = Loans)
## 
## $AllCreditGrades
##               diff          lwr          upr p adj
## A-AA   0.015656356  0.014638105  0.016674606     0
## B-AA   0.032301445  0.031292310  0.033310580     0
## C-AA   0.044112993  0.043123541  0.045102445     0
## D-AA   0.064737053  0.063716141  0.065757964     0
## E-AA   0.070700639  0.069617781  0.071783496     0
## HR-AA  0.059635718  0.058476470  0.060794967     0
## B-A    0.016645089  0.015909794  0.017380385     0
## C-A    0.028456637  0.027748597  0.029164678     0
## D-A    0.049080697  0.048329321  0.049832073     0
## E-A    0.055044283  0.054210684  0.055877882     0
## HR-A   0.043979363  0.043048684  0.044910042     0
## C-B    0.011811548  0.011116681  0.012506415     0
## D-B    0.032435608  0.031696633  0.033174583     0
## E-B    0.038399194  0.037576755  0.039221632     0
## HR-B   0.027334273  0.026413577  0.028254970     0
## D-C    0.020624060  0.019912198  0.021335921     0
## E-C    0.026587646  0.025789480  0.027385811     0
## HR-C   0.015522725  0.014623645  0.016421805     0
## E-D    0.005963586  0.005126739  0.006800432     0
## HR-D  -0.005101334 -0.006034924 -0.004167745     0
## HR-E  -0.011064920 -0.012065875 -0.010063966     0
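The model formula in the output above, `aov(EstimatedReturn ~ AllCreditGrades, data = Loans)`, followed by `TukeyHSD()`, can be reproduced on simulated data as a minimal sketch. The group means loosely echo the per-grade means reported above, but the sample size, the standard deviation, and the restriction to four grades are assumptions made to keep the example small.

```r
# Simulated stand-in for the loan data (not the real observations).
set.seed(1)
grades <- c("AA", "A", "B", "C")
Loans <- data.frame(
  AllCreditGrades = factor(rep(grades, each = 50), levels = grades),
  EstimatedReturn = c(rnorm(50, 0.054, 0.01), rnorm(50, 0.070, 0.01),
                      rnorm(50, 0.086, 0.01), rnorm(50, 0.098, 0.01))
)

# One-way ANOVA: does the mean estimated return differ across grades?
fit <- aov(EstimatedReturn ~ AllCreditGrades, data = Loans)
print(summary(fit))

# Tukey HSD post hoc test: which specific pairs of grades differ?
tukey <- TukeyHSD(fit)
print(tukey)
```

The ANOVA only says that at least one grade differs from the others. The Tukey table is what supports statements about individual pairs, and the sign of `diff` shows the direction, as in the negative HR-D and HR-E rows above.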

Of loans that are completed and not charged off or defaulted, the highest mean return actually comes from B-grade loans ($1,312), closely followed by D-grade loans ($1,293). D-grade loans have the highest median return ($974.40).

## $AA
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -7221.0   137.3   429.3   871.6  1103.0 13230.0 
## 
## $A
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -8444.0   255.9   640.0  1030.0  1338.0 15700.0 
## 
## $B
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -13570.0    390.0    914.7   1312.0   1794.0  10690.0 
## 
## $C
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -3000.0   362.9   844.2  1218.0  1606.0 11810.0 
## 
## $D
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -8819.0   467.7   974.4  1293.0  1777.0 13010.0 
## 
## $E
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -2904.0   411.6   829.3  1108.0  1522.0 10800.0 
## 
## $HR
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  -512.5   387.5   729.5   921.1  1270.0  9164.0 
## 
## $NC
## NULL

Charged-off loans seem to have truly begun during the 2008 recession.

People who charge off or default on their loans seem to have about the same median number of current credit lines (8 and 10) as those who complete their loans (9).

## $Cancelled
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##       9       9       9       9       9       9 
## 
## $Chargedoff
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   5.000   8.000   8.846  12.000  48.000 
## 
## $Completed
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   0.000   6.000   9.000   9.692  13.000  59.000 
## 
## $Current
## NULL
## 
## $Defaulted
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    6.00   10.00   10.64   14.00   52.00 
## 
## $FinalPaymentInProgress
## NULL
## 
## $`Past Due (>120 days)`
## NULL
## 
## $`Past Due (1-15 days)`
## NULL
## 
## $`Past Due (16-30 days)`
## NULL
## 
## $`Past Due (31-60 days)`
## NULL
## 
## $`Past Due (61-90 days)`
## NULL
## 
## $`Past Due (91-120 days)`
## NULL

As your credit grade gets better, your APR gets lower. This relationship is not due to random variance: an analysis of variance gives an F value of 55574, and a post hoc test gives extremely low p-values for all pairs. Another important observation is that even within a specific credit grade a wide range of APRs is possible. Even with an HR credit grade, a borrower can get a loan APR as low as 0.00864.

## $AA
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01650 0.08325 0.09136 0.09641 0.10140 0.33170 
## 
## $A
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01315 0.12450 0.13710 0.13830 0.15040 0.36620 
## 
## $B
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01325 0.16650 0.17750 0.17970 0.19500 0.37630 
## 
## $C
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00653 0.20040 0.22110 0.21840 0.24200 0.40240 
## 
## $D
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00653 0.24610 0.27470 0.26600 0.29510 0.41360 
## 
## $E
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01657 0.30130 0.32440 0.31550 0.34620 0.41360 
## 
## $HR
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00864 0.30550 0.35640 0.32760 0.35800 0.51230 
## 
## $NC
## NULL
##                     Df Sum Sq Mean Sq F value Pr(>F)    
## AllCreditGrades      7  568.3   81.18   55574 <2e-16 ***
## Residuals       113773  166.2    0.00                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 156 observations deleted due to missingness
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = BorrowerAPR ~ AllCreditGrades, data = Loans)
## 
## $AllCreditGrades
##              diff          lwr         upr   p adj
## A-AA   0.04190859  0.040403734  0.04341344 0.0e+00
## B-AA   0.08329820  0.081819862  0.08477653 0.0e+00
## C-AA   0.12203080  0.120591084  0.12347051 0.0e+00
## D-AA   0.16957414  0.168089480  0.17105881 0.0e+00
## E-AA   0.21910684  0.217513370  0.22070031 0.0e+00
## HR-AA  0.23118069  0.229507762  0.23285362 0.0e+00
## NC-AA  0.13860918  0.128741704  0.14847666 0.0e+00
## B-A    0.04138961  0.040196623  0.04258259 0.0e+00
## C-A    0.08012221  0.078977432  0.08126699 0.0e+00
## D-A    0.12766556  0.126464737  0.12886637 0.0e+00
## E-A    0.17719825  0.175865254  0.17853125 0.0e+00
## HR-A   0.18927210  0.187845067  0.19069914 0.0e+00
## NC-A   0.09670060  0.086871817  0.10652937 0.0e+00
## C-B    0.03873260  0.037622914  0.03984229 0.0e+00
## D-B    0.08627595  0.085108533  0.08744336 0.0e+00
## E-B    0.13580864  0.134505657  0.13711163 0.0e+00
## HR-B   0.14788249  0.146483451  0.14928154 0.0e+00
## NC-B   0.05531099  0.045486234  0.06513574 0.0e+00
## D-C    0.04754334  0.046425239  0.04866145 0.0e+00
## E-C    0.09707604  0.095817042  0.09833504 0.0e+00
## HR-C   0.10914989  0.107791722  0.11050806 0.0e+00
## NC-C   0.01657838  0.006759368  0.02639740 8.6e-06
## E-D    0.04953270  0.048222535  0.05084286 0.0e+00
## HR-D   0.06160655  0.060200820  0.06301228 0.0e+00
## NC-D  -0.03096496 -0.040790667 -0.02113925 0.0e+00
## HR-E   0.01207385  0.010553657  0.01359405 0.0e+00
## NC-E  -0.08049766 -0.090340392 -0.07065492 0.0e+00
## NC-HR -0.09257151 -0.102427420 -0.08271560 0.0e+00
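The ANOVA table and Tukey comparisons above correspond to the `aov(BorrowerAPR ~ AllCreditGrades, data = Loans)` fit named in the output. A minimal sketch of that workflow is shown below. The data are simulated (four grades with made-up means), so the numbers will not match the report; only the formula and the function calls are meant to carry over.

```r
# Simulated stand-in for Loans: four grades with made-up APR distributions.
set.seed(42)
grades <- c("AA", "A", "B", "C")
Loans <- data.frame(
  AllCreditGrades = factor(rep(grades, each = 200), levels = grades),
  BorrowerAPR = c(rnorm(200, 0.09, 0.02), rnorm(200, 0.14, 0.02),
                  rnorm(200, 0.18, 0.02), rnorm(200, 0.22, 0.02))
)

# One-way ANOVA: does mean APR differ between credit grades?
fit <- aov(BorrowerAPR ~ AllCreditGrades, data = Loans)
print(summary(fit))

# Tukey's HSD post hoc test: which pairs of grades differ,
# with p-values adjusted for the number of comparisons.
tukey <- TukeyHSD(fit, conf.level = 0.95)
print(tukey)
```

A large F value only says that at least one grade differs from the others; the Tukey table is what supports the pair-by-pair claim.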

I found the median APR for all credit grades except AA increased after 2009.

CreditGrade  median_before_2009  mean_before_2009    n.x  median_after_2009  mean_after_2009    n.y
A                      0.125520         0.1356993   3314            0.13799        0.1389094  14551
AA                     0.096880         0.1061880   3495            0.09000        0.0900407   5372
B                      0.156020         0.1643373   4387            0.18173        0.1840300  15581
C                      0.180440         0.1934553   5646            0.22362        0.2261244  18345
D                      0.213975         0.2255261   5152            0.28488        0.2805805  14274
E                      0.269135         0.2707125   3288            0.33215        0.3305506   9795
HR                     0.281790         0.2712610   3506            0.35797        0.3560612   6935
NA                     0.219450         0.2265969  84984            0.18224        0.1959623  29059
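The `n.x` and `n.y` column names suggest the table was built by merging two per-grade summaries, one per period. The sketch below shows one way to do that in base R on made-up data. The `Year` column, the helper `summarise_apr`, and the choice to count 2009 itself as "after" are assumptions for illustration, not taken from the original code.

```r
# Synthetic stand-in for Loans (all values are made up).
set.seed(7)
Loans <- data.frame(
  CreditGrade = sample(c("AA", "A", "HR"), 600, replace = TRUE),
  Year = sample(2006:2013, 600, replace = TRUE),
  BorrowerAPR = runif(600, 0.05, 0.40)
)

# Hypothetical helper: median, mean and count of APR per credit grade.
summarise_apr <- function(df, label) {
  groups <- split(df$BorrowerAPR, df$CreditGrade)
  out <- data.frame(
    CreditGrade = names(groups),
    median = sapply(groups, median),
    mean = sapply(groups, mean),
    n = sapply(groups, length),
    row.names = NULL
  )
  # Tag the statistic columns with the period, e.g. "median_before_2009".
  names(out)[2:3] <- paste0(names(out)[2:3], "_", label)
  out
}

before <- summarise_apr(Loans[Loans$Year < 2009, ], "before_2009")
after  <- summarise_apr(Loans[Loans$Year >= 2009, ], "after_2009")

# Both inputs have a column called "n", so merge() renames them n.x and n.y.
apr_by_period <- merge(before, after, by = "CreditGrade")
print(apr_by_period)
```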

If you look closer at the estimated returns of HR loans, they can be negative in 2009 and 2010, but in normal economic times they are usually the highest estimated returns.

## $`2009`
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.18270  0.00500  0.10200  0.06753  0.13990  0.13990 
## 
## $`2010`
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -0.17730  0.06355  0.12220  0.08788  0.13690  0.13990 
## 
## $`2011`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.0887  0.1087  0.1148  0.1140  0.1246  0.1267 
## 
## $`2012`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1124  0.1221  0.1246  0.1228  0.1246  0.1271 
## 
## $`2013`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1043  0.1131  0.1135  0.1145  0.1185  0.1185 
## 
## $`2014`
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.09975 0.10430 0.10430 0.10490 0.10660 0.10660
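A per-year breakdown like the one above needs a year extracted from a timestamp column first. The sketch below does this on a few made-up rows. It assumes the year comes from `ListingCreationDate` (the report does not say which date column was used) and that the return column is called `EstimatedReturn`.

```r
# Four made-up rows in the timestamp format the dataset uses.
Loans <- data.frame(
  ListingCreationDate = c("2009-08-14 10:02:11.000000000",
                          "2009-11-02 08:15:40.000000000",
                          "2010-03-21 17:45:09.000000000",
                          "2011-06-30 12:00:00.000000000"),
  AllCreditGrades = c("HR", "HR", "HR", "A"),
  EstimatedReturn = c(-0.05, 0.10, 0.12, 0.08)
)

# The first four characters of the timestamp are the year.
# as.character() guards against the column having been read as a factor.
Loans$Year <- substr(as.character(Loans$ListingCreationDate), 1, 4)

# Keep HR loans only, then summarise the estimated return per year.
hr <- Loans[Loans$AllCreditGrades == "HR", ]
by_year <- tapply(hr$EstimatedReturn, hr$Year, summary)
print(by_year)
```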

Charged-off and defaulted loans tended to have a higher estimated return than completed loans (medians of 0.125 and 0.127 versus 0.107). Sadly, these loans most likely will not get paid off.

## $Cancelled
## NULL
## 
## $Chargedoff
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## -0.1816  0.1108  0.1246  0.1234  0.1440  0.2837 
## 
## $Completed
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  -0.183   0.072   0.107   0.102   0.132   0.267   18410 
## 
## $Current
## NULL
## 
## $Defaulted
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  -0.046   0.109   0.127   0.123   0.144   0.254    4013 
## 
## $FinalPaymentInProgress
## NULL
## 
## $`Past Due (>120 days)`
## NULL
## 
## $`Past Due (1-15 days)`
## NULL
## 
## $`Past Due (16-30 days)`
## NULL
## 
## $`Past Due (31-60 days)`
## NULL
## 
## $`Past Due (61-90 days)`
## NULL
## 
## $`Past Due (91-120 days)`
## NULL

It seems Prosper mostly facilitates debt consolidation loans to borrowers with a C credit grade.

The C credit grade seems to be the most prevalent in most states.

It seems that more highly educated borrowers with higher-paying jobs tend to have higher credit grades, but a lot of borrowers in high-paying jobs still have bad credit ratings.

Most borrowers who are granted a loan are employed or employed full-time.

It would seem that the higher the borrower’s credit grade, the larger their loan amount usually is. On the other hand, the difference between the NC and HR mean loan amounts does not seem to be significant based on the Tukey post hoc test (adjusted p = 0.68). It is also very interesting that the median loan amount is identical ($10,000) for the AA, A and B credit grades; their means differ only slightly, and the difference between A and B is not significant (adjusted p = 0.087).

## $AA
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    4851   10000   10620   15000   35000 
## 
## $A
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    5000   10000   11060   15000   35000 
## 
## $B
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    5000   10000   10900   15000   35000 
## 
## $C
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    4000    8500    9382   15000   25000 
## 
## $D
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    3200    5000    6474   10000   25000 
## 
## $E
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    3000    4000    4286    5000   25000 
## 
## $HR
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    2000    3200    3124    4000   20000 
## 
## $NC
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1000    1000    2000    2316    3000   10000
##                     Df    Sum Sq   Mean Sq F value Pr(>F)    
## AllCreditGrades      7 9.064e+11 1.295e+11    4169 <2e-16 ***
## Residuals       113798 3.535e+12 3.106e+07                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 131 observations deleted due to missingness
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = LoanOriginalAmount ~ AllCreditGrades, data = Loans)
## 
## $AllCreditGrades
##             diff          lwr         upr     p adj
## A-AA    437.9988    218.68401   657.31367 0.0000000
## B-AA    275.3728     59.92821   490.81733 0.0027090
## C-AA  -1238.0426  -1447.85232 -1028.23281 0.0000000
## D-AA  -4145.7516  -4362.12101 -3929.38220 0.0000000
## E-AA  -6333.5255  -6565.76680 -6101.28419 0.0000000
## HR-AA -7495.6676  -7739.49364 -7251.84166 0.0000000
## NC-AA -8303.2314  -9737.02281 -6869.43995 0.0000000
## B-A    -162.6261   -336.57628    11.32415 0.0868471
## C-A   -1676.0414  -1842.96190 -1509.12090 0.0000000
## D-A   -4583.7504  -4758.84481 -4408.65607 0.0000000
## E-A   -6771.5243  -6965.89085 -6577.15782 0.0000000
## HR-A  -7933.6665  -8141.73723 -7725.59574 0.0000000
## NC-A  -8741.2302 -10169.37593 -7313.08450 0.0000000
## C-B   -1513.4153  -1675.21712 -1351.61355 0.0000000
## D-B   -4421.1244  -4591.34600 -4250.90275 0.0000000
## E-B   -6608.8983  -6798.88697 -6418.90957 0.0000000
## HR-B  -7771.0404  -7975.02767 -7567.05317 0.0000000
## NC-B  -8578.6041 -10006.16064 -7151.04766 0.0000000
## D-C   -2907.7090  -3070.74026 -2744.67782 0.0000000
## E-C   -5095.4829  -5279.05712 -4911.90875 0.0000000
## HR-C  -6257.6251  -6455.65178 -6059.59839 0.0000000
## NC-C  -7065.1888  -8491.90579 -5638.47184 0.0000000
## E-D   -2187.7739  -2378.81072 -1996.73707 0.0000000
## HR-D  -3349.9160  -3554.87985 -3144.95225 0.0000000
## NC-D  -4157.4798  -5585.17614 -2729.78341 0.0000000
## HR-E  -1162.1422  -1383.79608  -940.48822 0.0000000
## NC-E  -1969.7059  -3399.89370  -539.51806 0.0007845
## NC-HR  -807.5637  -2239.67836   624.55089 0.6813763
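With 28 pairwise comparisons it is easy to misread the table, so it can help to pull out only the pairs that are not significant. The sketch below does that on simulated data (three grades sharing one made-up mean and one clearly lower grade); the 0.05 cutoff is the conventional choice and an assumption here, and the numbers will not match the report.

```r
# Simulated stand-in: AA, A and B share a mean, HR is far lower.
set.seed(3)
grades <- c("AA", "A", "B", "HR")
Loans <- data.frame(
  AllCreditGrades = factor(rep(grades, each = 150), levels = grades),
  LoanOriginalAmount = c(rnorm(150, 10600, 3000), rnorm(150, 10600, 3000),
                         rnorm(150, 10600, 3000), rnorm(150, 3100, 3000))
)

fit <- aov(LoanOriginalAmount ~ AllCreditGrades, data = Loans)

# TukeyHSD() returns one matrix per factor with columns diff, lwr, upr, "p adj".
tukey_table <- TukeyHSD(fit)$AllCreditGrades

# Keep only the pairs whose adjusted p-value exceeds the cutoff.
not_significant <- tukey_table[tukey_table[, "p adj"] > 0.05, , drop = FALSE]
print(not_significant)
```

Applied to the table above, this filter would leave only the B-A and NC-HR rows.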

It is interesting that the lower credit score range for credit grade HR has two distinct spikes, around 500 and 660. Credit grade D also has two peaks, around 560 and 660.

Homemakers are the most likely to have a DebtToIncomeRatio of over 10 to 1. One plausible explanation is that homemakers typically report little or no income of their own, which inflates the ratio.

Although most people receive no recommendations for a loan, those with a lower debt-to-income range have more outliers with high numbers of recommendations.

When an investor loans money to a friend, although it is counterintuitive, it seems they were more likely to invest more with a friend who has the very low credit grade of HR.

Multivariate Plots

Charge-offs came from people with anywhere from a small to a large number of delinquencies in the last 7 years, meaning the number of delinquencies would not be the best statistic for telling whether someone was going to fail to repay their loan.

There are more very low or negative return outliers for high-risk loans in the Auto, Business, Debt Consolidation, and Home Improvement categories. There is also a very wide distribution of estimated returns for Student Use loans.

It seems that the higher your credit score, the more money is lost when you do not repay your loan.

As one looks closer at the estimated returns, one finds that borrowers with lower credit grades have the potential to give higher returns in normal, prosperous economic times. On the other hand, in bad economic times they also have the potential to yield significantly lower or even negative returns.

It is interesting that if you have over 35 lines of credit your credit score seems to be very volatile.

It makes sense that as you get a better credit rating you have fewer delinquencies and a higher number of CurrentCreditLines.

It is interesting that the amount of money delinquent at its peak is so similar between credit ratings A and HR.

It seems the number of recommendations a borrower received did not play a large role in the estimated return from their loan on average. At the same time, it seems HR loans were a lot more likely to have negative returns.

Final Plots and Summary

I found the median APR for all credit grades except AA increased after 2009. It was especially interesting that the mean APR for an AA credit grade borrower decreased by about 15% after 2009, while the mean APR for an HR credit grade borrower increased by over 31%. Could such drastic changes have been caused by the huge credit crisis of the 2008-2009 recession? This chart alone shows me the significance of having a great credit rating!

CreditGrade  median_before_2009  mean_before_2009    n.x  median_after_2009  mean_after_2009    n.y
A                      0.125520         0.1356993   3314            0.13799        0.1389094  14551
AA                     0.096880         0.1061880   3495            0.09000        0.0900407   5372
B                      0.156020         0.1643373   4387            0.18173        0.1840300  15581
C                      0.180440         0.1934553   5646            0.22362        0.2261244  18345
D                      0.213975         0.2255261   5152            0.28488        0.2805805  14274
E                      0.269135         0.2707125   3288            0.33215        0.3305506   9795
HR                     0.281790         0.2712610   3506            0.35797        0.3560612   6935
NA                     0.219450         0.2265969  84984            0.18224        0.1959623  29059
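The two percentage changes quoted for AA and HR are relative changes in the mean APR, and they can be checked directly from the mean columns of this table. A small sketch of that arithmetic follows, using only the AA and HR rows; the column name `pct_change` is introduced here for illustration.

```r
# Mean APR before and after 2009, copied from the AA and HR rows of the table.
apr <- data.frame(
  CreditGrade = c("AA", "HR"),
  mean_before_2009 = c(0.1061880, 0.2712610),
  mean_after_2009 = c(0.0900407, 0.3560612)
)

# Relative change in percent: 100 * (after - before) / before.
apr$pct_change <- 100 * (apr$mean_after_2009 - apr$mean_before_2009) /
  apr$mean_before_2009

print(round(apr$pct_change, 1))
```

This gives roughly -15.2% for AA and +31.3% for HR. These are relative changes; in absolute terms the AA mean fell by about 1.6 percentage points and the HR mean rose by about 8.5.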

As one looks closer at the estimated returns, one finds that borrowers with lower credit grades have the potential to give higher returns in normal, prosperous economic times. On the other hand, in bad economic times they also have the potential to yield significantly lower or even negative returns. It is very important for investors to truly understand how difficult it is to get their money back at all if someone’s loan is charged off.

I wanted to better understand the sheer magnitude of losses from charged-off loans, since I am guessing many investors who use this “eBay for loans” don’t really understand the huge risks of specific loans. If you add up all the losses investors took on charged-off loans, you find investors lost $20,313,392, of which $1,159,688 went to lining Prosper’s pockets. I would think twice before lending to someone with an HR credit grade, especially since Prosper still makes over a million dollars even if you lose your life savings.

Charge_off_losses Prosper_Profits
20313392 1159688
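The report does not show how these two totals were computed. One plausible way is sketched below on four made-up rows. It assumes losses are taken from a column such as `LP_NetPrincipalLoss` and Prosper's earnings from a column such as `LP_ServiceFees` stored as negative amounts; both choices are assumptions for illustration and may differ from the original calculation.

```r
# Four made-up loans; column choices are assumptions, not the author's method.
Loans <- data.frame(
  LoanStatus = c("Chargedoff", "Chargedoff", "Completed", "Defaulted"),
  LP_NetPrincipalLoss = c(2500, 4000, 0, 1800),
  # Fees are assumed to be recorded as negative amounts (money leaving the investor).
  LP_ServiceFees = c(-35.5, -60.25, -120, -20)
)

# Restrict to charged-off loans, then total the two columns.
charged_off <- Loans[Loans$LoanStatus == "Chargedoff", ]
totals <- data.frame(
  Charge_off_losses = sum(charged_off$LP_NetPrincipalLoss, na.rm = TRUE),
  Prosper_Profits = -sum(charged_off$LP_ServiceFees, na.rm = TRUE)
)
print(totals)
```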

When an investor loans money to a friend, although it is counterintuitive, it seems they were more likely to invest more with a friend who has the very low credit grade of HR. Could this be because they feel sorry for their friend and lend them money purely because they think no one else would logically do so? I would love to be able to ask these people why they are lending this money to a friend with an HR rating.

Reflections

I wish I had been given more background on each of the dataset’s variables. I truly tried to calculate how much money people made overall from all the loans, but this was very difficult. I wanted to understand the overall payout of all loans for investors on Prosper. Sadly, the dataset made this extremely difficult since I could not completely figure out how the data was created. For example, listing number 150265 had a Net Principal Loss of $7,603.16 on a $20,000 loan but had non-principal recovery payments amounting to $21,117.90. How is this possible?

I would also love to have more data on the investors themselves. How much money do these investors make a year, and what background do they have in investing? This could give me a better understanding of whether someone with no investing experience can make money investing in Prosper loans or is more likely to lose large quantities of money.

Finally, I wish I had a way to compare Prosper loan statistics to a normal bank’s loan statistics. I would love to understand how the lending practices differ between the two and whether these differences change the overall loan returns on average. Do banks refuse to lend money to specific borrowers while Prosper allows such loans, since it still makes money from facilitating extremely risky loans?